Search CORE

77 research outputs found

Allocating Divisible Resources on Arms with Unknown and Random Rewards

Author: Chen Ningyuan
Li Wenhao
Publication venue
Publication date: 28/06/2023
Field of study

We consider a decision maker allocating one unit of renewable and divisible resource in each period on a number of arms. The arms have unknown and random rewards whose means are proportional to the allocated resource and whose variances are proportional to an order

b

of the allocated resource. In particular, if the decision maker allocates resource

A_i

to arm

i

in a period, then the reward

Y_i

Y_i(A_i)=A_i \mu_i+A_i^b \xi_{i}

, where

\mu_i

is the unknown mean and the noise

\xi_{i}

is independent and sub-Gaussian. When the order

b

ranges from 0 to 1, the framework smoothly bridges the standard stochastic multi-armed bandit and online learning with full feedback. We design two algorithms that attain the optimal gap-dependent and gap-independent regret bounds for

b\in [0,1]

, and demonstrate a phase transition at

b=1/2

. The theoretical results hinge on a novel concentration inequality we have developed that bounds a linear combination of sub-Gaussian random variables whose weights are fractional, adapted to the filtration, and monotonic

arXiv.org e-Print Archive

Competition among Parallel Contests

Author: Deng Xiaotie
Li Ningyuan
Li Weian
Qi Qi
Publication venue
Publication date: 27/10/2022
Field of study

We investigate the model of multiple contests held in parallel, where each contestant selects one contest to join and each contest designer decides the prize structure to compete for the participation of contestants. We first analyze the strategic behaviors of contestants and completely characterize the symmetric Bayesian Nash equilibrium. As for the strategies of contest designers, when other designers' strategies are known, we show that computing the best response is NP-hard and propose a fully polynomial time approximation scheme (FPTAS) to output the

\epsilon

-approximate best response. When other designers' strategies are unknown, we provide a worst case analysis on one designer's strategy. We give an upper bound on the utility of any strategy and propose a method to construct a strategy whose utility can guarantee a constant ratio of this upper bound in the worst case.Comment: Accepted by the 18th Conference on Web and Internet Economics (WINE 2022

arXiv.org e-Print Archive

Development of a probabilistic based, integrated pavement management system

Author: Li Ningyuan
Publication venue: 'University of Waterloo'
Publication date: 01/01/1997
Field of study

University of Waterloo's Institutional Repository

Revenue Maximization and Learning in Products Ranking

Author: Chen Ningyuan
Li Anran
Yang Shuoguang
Publication venue
Publication date: 07/12/2020
Field of study

We consider the revenue maximization problem for an online retailer who plans to display a set of products differing in their prices and qualities and rank them in order. The consumers have random attention spans and view the products sequentially before purchasing a ``satisficing'' product or leaving the platform empty-handed when the attention span gets exhausted. Our framework extends the cascade model in two directions: the consumers have random attention spans instead of fixed ones and the firm maximizes revenues instead of clicking probabilities. We show a nested structure of the optimal product ranking as a function of the attention span when the attention span is fixed and design a

1/e

-approximation algorithm accordingly for the random attention spans. When the conditional purchase probabilities are not known and may depend on consumer and product features, we devise an online learning algorithm that achieves

\tilde{\mathcal{O}}(\sqrt{T})

regret relative to the approximation algorithm, despite of the censoring of information: the attention span of a customer who purchases an item is not observable. Numerical experiments demonstrate the outstanding performance of the approximation and online learning algorithms

arXiv.org e-Print Archive

Algorithmic Decision-Making Safeguarded by Human Knowledge

Author: Chen Ningyuan
Hu Ming
Li Wenhao
Publication venue
Publication date: 20/11/2022
Field of study

Commercial AI solutions provide analysts and managers with data-driven business intelligence for a wide range of decisions, such as demand forecasting and pricing. However, human analysts may have their own insights and experiences about the decision-making that is at odds with the algorithmic recommendation. In view of such a conflict, we provide a general analytical framework to study the augmentation of algorithmic decisions with human knowledge: the analyst uses the knowledge to set a guardrail by which the algorithmic decision is clipped if the algorithmic output is out of bound, and seems unreasonable. We study the conditions under which the augmentation is beneficial relative to the raw algorithmic decision. We show that when the algorithmic decision is asymptotically optimal with large data, the non-data-driven human guardrail usually provides no benefit. However, we point out three common pitfalls of the algorithmic decision: (1) lack of domain knowledge, such as the market competition, (2) model misspecification, and (3) data contamination. In these cases, even with sufficient data, the augmentation from human knowledge can still improve the performance of the algorithmic decision

arXiv.org e-Print Archive

Equilibrium Analysis of Customer Attraction Games

Author: Deng Xiaotie
Li Ningyuan
Li Weian
Qi Qi
Publication venue
Publication date: 14/07/2023
Field of study

We introduce a game model called "customer attraction game" to demonstrate the competition among online content providers. In this model, customers exhibit interest in various topics. Each content provider selects one topic and benefits from the attracted customers. We investigate both symmetric and asymmetric settings involving agents and customers. In the symmetric setting, the existence of pure Nash equilibrium (PNE) is guaranteed, but finding a PNE is PLS-complete. To address this, we propose a fully polynomial time approximation scheme to identify an approximate PNE. Moreover, the tight Price of Anarchy (PoA) is established. In the asymmetric setting, we show the nonexistence of PNE in certain instances and establish that determining its existence is NP-hard. Nevertheless, we prove the existence of an approximate PNE. Additionally, when agents select topics sequentially, we demonstrate that finding a subgame-perfect equilibrium is PSPACE-hard. Furthermore, we present the sequential PoA for the two-agent setting

arXiv.org e-Print Archive

Competition among Pairwise Lottery Contests

Author: Deng Xiaotie
Gan Hangxin
Li Ningyuan
Li Weian
Qi Qi
Publication venue
Publication date: 20/12/2023
Field of study

We investigate a two-stage competitive model involving multiple contests. In this model, each contest designer chooses two participants from a pool of candidate contestants and determines the biases. Contestants strategically distribute their efforts across various contests within their budget. We first show the existence of a pure strategy Nash equilibrium (PNE) for the contestants, and propose a polynomial-time algorithm to compute an

\epsilon

-approximate PNE. In the scenario where designers simultaneously decide the participants and biases, the subgame perfect equilibrium (SPE) may not exist. Nonetheless, when designers' decisions are made in two substages, the existence of SPE is established. In the scenario where designers can hold multiple contests, we show that the SPE exists under mild conditions and can be computed efficiently.Comment: Accepted by the 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024

arXiv.org e-Print Archive

On the complexity of computing Markov perfect equilibrium in general-sum stochastic games

Author: Deng Xiaotie
Li Ningyuan
Mguni David
Wang Jun
Yang Yaodong
Publication venue: OXFORD UNIV PRESS
Publication date: 22/11/2022
Field of study

Similar to the role of Markov decision processes in reinforcement learning, Markov games (also called stochastic games) lay down the foundation for the study of multi-agent reinforcement learning and sequential agent interactions. We introduce approximate Markov perfect equilibrium as a solution to the computational problem of finite-state stochastic games repeated in the infinite horizon and prove its PPAD-completeness. This solution concept preserves the Markov perfect property and opens up the possibility for the success of multi-agent reinforcement learning algorithms on static two-player games to be extended to multi-agent dynamic games, expanding the reign of the PPAD-complete class

UCL Discovery

PubMed Central

Deep Learning is Provably Robust to Symmetric Label Noise

Author: Chen Li
Huang Ningyuan
Mu Cong
Priebe Carey E.
Villar Soledad
Publication venue
Publication date: 26/10/2022
Field of study

Deep neural networks (DNNs) are capable of perfectly fitting the training data, including memorizing noisy data. It is commonly believed that memorization hurts generalization. Therefore, many recent works propose mitigation strategies to avoid noisy data or correct memorization. In this work, we step back and ask the question: Can deep learning be robust against massive label noise without any mitigation? We provide an affirmative answer for the case of symmetric label noise: We find that certain DNNs, including under-parameterized and over-parameterized models, can tolerate massive symmetric label noise up to the information-theoretic threshold. By appealing to classical statistical theory and universal consistency of DNNs, we prove that for multiclass classification,

L_1

-consistent DNN classifiers trained under symmetric label noise can achieve Bayes optimality asymptotically if the label noise probability is less than

\frac{K-1}{K}

, where

K \ge 2

is the number of classes. Our results show that for symmetric label noise, no mitigation is necessary for

L_1

-consistent estimators. We conjecture that for general label noise, mitigation strategies that make use of the noisy data will outperform those that ignore the noisy data

arXiv.org e-Print Archive